Abstract


This  research  determined  whether  the  provision  of  supplementary  text  would  help  resolve  the 
problem of auditory overload during military command and control operations. Twenty-four
English-fluent, normal-hearing male and female listeners were presented one block of
78 triads of simultaneous messages over a communication headset under each of twelve listening
conditions.  The  messages  were  recordings  made  by  two  males  (left  and  right  ears,  respectively) 
and  a  female.  Listening  conditions  were  defined  by  combinations  of  the  background  (quiet  or 
vehicle  noise),  ear  assignment  of  the  female  talker  (right  or  left  ear),  and  the  provision  of 
supplementary  text  (none,  random  but  equally  likely  across  the  three  talkers  or  associated  with 
one  of  the  three  talkers).  Using  a  computer  keypad,  listeners  encoded  only  those  messages  which 
began  with  a  pre-assigned  call  sign.  These  occurred  once  during  27  of  the  78  triads,  nine  from 
each of the three talkers. The overall mean percentage of correct responses was 78%. Male and
female listeners performed similarly, and male and female talkers were equally intelligible. There was a
significant right ear advantage for discriminating among talkers. Provision of text increased the
percentage correct by 10-26% without compromising the understanding of
unaccompanied target messages.


Significance  to  defence  and  security 


Auditory  overload,  the  problem  of  competing  messages,  is  particularly  challenging  in  military 
command  and  control  where  operators  may  be  tasked  with  monitoring,  responding  to  and  relaying 
messages  arriving  concurrently  over  several  networks  associated  with  different  levels  of 
command.  The  task  may  be  further  complicated  by  the  masking  effect  of  background  noise.  This 
research  showed  that  auditory  overload  may  be  resolved,  in  part,  by  providing  supplementary  text 
for  one  of  the  audio  networks,  without  compromising  the  understanding  of  unaccompanied  audio. 
Male and female listeners performed similarly, and male and female talkers were equally intelligible. The finding of a right
ear  advantage  suggests  that  higher  priority  messages  should  be  delivered  to  the  dominant  right 
ear. 
Summary


This research determined whether adding supplementary text could resolve the
problem of auditory overload during military command and control operations.
The listeners, 24 men and women who spoke English fluently and had normal
hearing, were presented a series of 78 blocks of three simultaneous messages, transmitted over a
headset, under each of twelve listening conditions. The messages had been
recorded by two men (left ear and right ear, respectively) and one woman. The
listening conditions were defined by a combination of background (quiet or vehicle
noise), ear assignment of the female talker (right or left ear) and the addition of
supplementary text (none, random but equally likely to come from any of the three
talkers, or associated with one of the three talkers). Using a numeric keypad, the
listeners encoded only those messages that began with a pre-assigned call sign. These
call signs, given nine times by each of the talkers, occurred in 27 of the 78 blocks of three
messages. The overall mean percentage of correct responses was 78%. Male and female
listeners obtained similar results, and male and female talkers were equally
intelligible. There was a strong right ear dominance for differentiating among the
talkers. Adding supplementary text increased the percentage of correct responses by 10
to 26%, without impairing comprehension of target messages that were not
accompanied by text.


Importance for defence and security


Auditory overload, a problem of competing messages, is particularly prevalent in
military command and control situations, where operators are asked to
monitor messages arriving at the same time over the many networks associated with different
levels of command, to respond to them and to relay them. This task can be further
complicated by the masking effect of background noise. This research showed that auditory
overload could be corrected in part by adding text to one of the audio networks, without impairing
comprehension of unaccompanied audio messages. Male and female listeners
obtained similar results, and male and female talkers were equally intelligible. The finding of a
right ear dominance suggests that priority messages should be delivered to the dominant right
ear.
1 Introduction


The  intent  of  this  research  was  to  determine  whether  the  provision  of  supplementary  text  would 
help  to  resolve  the  problem  of  auditory  overload  during  military  command  and  control  operations. 
Auditory  overload  refers  to  a  situation  in  which  military  members  (e.g.,  radio  operators)  are 
required  to  monitor,  transcribe,  respond  to  and  relay  orders  or  strategic  information  delivered 
simultaneously  over  two  or  more  audio  networks  or  channels.  Audio  channels  may  be  associated 
with  different  levels  of  command  or  units  within  a  battle  space.  The  possible  benefit  of  a  text 
message  in  addition  to  the  audio  for  one  of  a  triad  of  simultaneous  messages  presented  over  a 
communication  headset  in  vehicle  noise  was  explored. 

Previous studies have documented enhanced speech recognition with auditory-visual (AV) speech,
compared with either auditory (A) or visual (V) presentation alone (e.g., Grant et al., 1998; Grant and
Seitz,  1998),  as  well  as  the  benefits  of  multi-modal  communications  (Finomore  et  al.,  2010).  For 
example,  Grant  et  al.  (1998)  tested  participants  with  various  degrees  of  noise-induced  hearing  loss  on 
their  ability  to  recognize  consonants  and  sentences  presented  either  binaurally  over  a  headset  at  a 
comfortable  listening  level,  on  a  video  monitor  or  by  means  of  both  modes  of  communication 
simultaneously.  Items  were  presented  in  a  background  of  speech-spectrum  shaped  noise  at  a 
speech-to-noise  ratio  (SNR)  of  0  dB.  All  the  participants  showed  an  AV  benefit  for  both  types  of 
speech  materials.  In  the  case  of  consonants,  AV  recognition  scores  ranged  from  60-90%, 
compared  with  20-74%  for  A  and  21-40%  for  V  presentations.  The  pattern  of  outcomes  was  the 
same  for  the  sentences.  AV  presentations  resulted  in  recognition  scores  of  23-94%,  compared 
with  A  scores  of  5-70%  and  V  scores  of  0-20%. 

Using  a  different  paradigm  that  modelled  listening  in  a  military  operational  environment, 
Finomore  et  al.  (2010)  investigated  the  benefits  of  a  multi-modal  communications  (MMC)  suite 
that  incorporated  a  standard  radio,  3-D  audio,  a  repeat  function,  and  text-based  messaging  (chat). 
The  MMC  suite  was  compared  with  each  of  monaural  radio  communications,  3D  audio,  and  chat 
for  detecting  and  replying  to  target  messages  delivered  over  several  channels,  over  a  27-minute 
session.  Listeners  pressed  a  push-to-talk  button  to  signify  the  active  channel  and  repeated  back 
the  message  they  heard.  Target  message  detection  and  verbal  response  accuracy  were 
significantly  greater  with  MMC  and  chat  than  with  either  3D  audio  or  the  standard  radio.  In 
contrast,  detection  response  times  were  faster  for  3D  audio  and  the  standard  radio. 

More  recently,  Abel  et  al.  (2012,  2014)  assessed  the  benefit  of  providing  visual  cues  for 
communication  in  a  mock-up  of  the  crew  compartment  of  a  mobile  command  post.  In  the  first  of 
two  studies,  Abel  et  al.  (2012)  presented  sets  of  concurrent  diotic  (same)  or  dichotic  (different) 
messages over the right and left earphones of a communication headset and a different message
over  a  four-speaker  array  surrounding  the  head,  in  quiet  or  in  a  background  of  either  noise 
recorded  in  a  land  vehicle  driving  along  a  highway  (Bison  noise),  speech  babble  noise  or  both. 
Speech  babble  simulated  irrelevant  conversation  in  close  proximity.  The  at-ear  SNRs  for 
messages  presented  over  the  headset  and  loudspeakers  were  0  dB  (babble  noise)  and  5  dB  (Bison 
noise). Using a computer keypad, normal-hearing, English-fluent listeners encoded words
contained only in messages beginning with a pre-assigned call sign. They achieved close to
100% correct for the headset messages, either in quiet or noise, and for the loudspeaker messages,
in  quiet.  The  percentage  correct  was  significantly  less  by  30-35%  when  messages  were  presented 
over  the  loudspeakers  in  the  Bison  noise.  Adding  the  babble  noise  decreased  the  loudspeaker 
percentage  correct  by  an  additional  12%.  A  visual  icon  on  the  listener’s  computer  monitor 
directing attention to the source of incoming target messages significantly increased the
percentage correct for the loudspeaker messages by 7%. This finding corroborates results reported
by Best et al. (2007) that a visual cue signifying the location of a message and its time of
occurrence will improve its identification.

In  their  second  study,  Abel  et  al.  (2014)  investigated  the  effect  of  replacing  audio  with  either 
visual  or  audio-visual  messages.  Normal-hearing  listeners  completed  two  concurrent  tasks,  again 
either  in  quiet  or  in  the  Bison  noise.  For  Task  1  they  were  presented  dichotic  pairs  of  messages 
with  simultaneous  onsets  over  right  and  left  earphones  of  a  communication  headset.  As  for  the 
first  experiment,  they  encoded  only  those  messages  beginning  with  a  pre-assigned  target  call  sign. 
For  Task  2  they  used  the  computer  keyboard  to  agree  or  disagree  with  the  correctness  of  simple 
mathematical  equations  (e.g.,  4+1=5)  which  occurred  randomly  during  headset  message  pairs  that 
did  not  contain  a  target.  The  equations  were  presented  either  (1)  over  the  four-speaker  array 
surrounding  the  head,  (2)  as  text  on  the  laptop  monitor,  (3)  in  both  the  audio  and  visual  modalities 
simultaneously or (4) not at all. The tasks were of equal importance. Listeners achieved a
mean score of at least 78% correct for dichotic phrases presented over the headset in
Task  1.  Averaged  across  experimental  conditions,  there  was  a  significant  right  ear  advantage 
of  7%.  The  right  ear  advantage  was  particularly  apparent  in  the  noise  background,  where  the 
interaural  difference  was  12%.  For  Task  2,  accuracy  was  significantly  better  by  20%  when  the 
equations  were  presented  visually  or  audio-visually.  These  findings  point  to  the  importance  of 
delivering  higher  priority  communications  to  the  dominant  (right)  ear,  and  the  advantage  of  using 
text  as  an  adjunct  to  audio  messaging. 

2  Purpose 


The present experiment was a follow-on to the work previously reported by Abel et al.
(2012, 2014). In the two previous studies, it was found that listeners had relatively little difficulty
discriminating  between  and  understanding  a  pair  of  messages  delivered  simultaneously  to  the  two 
ears  over  a  communication  headset  in  background  noise.  Loudspeaker  messages  were  more 
difficult  to  understand,  likely  because  they  had  to  cross  the  headset  barrier  which  may  have 
altered  their  spectra.  Performance  improved  significantly  if  a  visual  icon  directed  attention  to  the 
active  channel  or  the  audio  was  accompanied  by  text.  Questions  of  interest  for  the  present  study 
were  if,  and  the  degree  to  which,  the  understanding  of  headset  messages  would  deteriorate  if 
another  talker  was  added  to  the  mix,  in  one  or  other  ear,  and  whether  the  addition  of  text  for  one 
of  the  talkers  would  be  beneficial  or  detrimental.  Several  other  issues  were  addressed,  in  part  to 
confirm  previous  findings,  and  in  part  because  they  had  not  been  applied  to  communicators 
(e.g.,  radio  operators)  in  military  operational  environments  who  are  subject  to  auditory  overload 
(Abel,  2008).  These  were:  (1)  Do  male  and  female  listeners  differ  in  their  ability  to  understand 
speech?  (Markham  and  Hazan,  2004),  (2)  Does  the  presence  of  noise  exacerbate  auditory 
overload?  (Abel  et  al.,  2012),  (3)  Is  there  an  advantage  to  presenting  target  messages  to  either  the 
left or the right ear during multiple simultaneous presentations? (Kimura, 2011), (4) Are male and
female  talkers  equally  intelligible?  (Ellis  et  al.,  1996),  (5)  Does  the  addition  of  text  improve 
understanding,  compared  with  audio  presentation  alone,  when  there  are  multiple  talkers  using  the 
same  modality?  (Abel  et  al.,  2014),  (6)  Is  there  an  advantage  to  pairing  text  with  a  selected  talker 
or  is  it  as  likely  to  be  advantageous  if  randomly  paired  with  messages  across  talkers?,  and  (7)  If 
supplementary  text  aids  understanding,  does  it  occur  at  the  expense  of  understanding  audio  traffic 
that  is  not  accompanied  by  text?  The  last  two  questions  have  not  been  previously  addressed. 
3 Methods and materials


3.1  Participants 

Twenty-four  participants,  twelve  males  (aged  26-41  years)  and  twelve  females  (aged 
19-48  years),  with  normal  hearing  thresholds  no  greater  than  15  dB  HL  (decibels,  hearing  level) 
bilaterally  at  the  speech  frequencies,  0.5,  1,  2  and  4  kHz  (Yantis,  1985)  were  recruited  to  serve  as 
listeners  by  means  of  an  email  sent  to  employees  of  military  units  in  the  Toronto  area.  Sixteen 
were  serving  members  of  the  Canadian  Armed  Forces  (CAF)  and  had  some  prior  experience 
communicating  over  tactical  headsets.  The  remaining  eight  were  civilians  with  limited  experience 
with  the  use  of  such  devices.  Since  the  listeners  would  be  tested  in  a  sound  proof  room  for  an 
extended  period  of  time  with  auditory  materials,  they  were  screened  for  a  history  of 
claustrophobia,  the  use  of  medications  that  might  affect  the  ability  to  complete  the  study,  and  ear 
disease,  including  excess  wax  build  up,  hearing  loss  and  tinnitus.  Other  exclusion  criteria 
included  the  inability  to  read  instructions  on  a  laptop  monitor  at  the  test  distance  without  the  use 
of  corrective  glasses.  Glasses  have  been  shown  to  interfere  with  the  fit  of  muff-style  devices  of 
the  type  that  would  be  used  in  the  study  (Abel  et  al.,  2002).  To  control  for  the  effect  of  fluency  on 
speech  understanding,  all  were  required  to  be  proficient  in  speaking  and  understanding  English, 
the  test  language  (van  Wijngaarden  et  al.,  2002).  Only  those  whose  native  language  was  English 
or  those  who  had  learned  English  before  the  age  of  12  years  were  considered  for  the  study 
(Mayo  et  al.,  1977).  They  were  also  required  to  obtain  a  score  of  at  least  85%  on  an  adapted  timed 
(20  minutes)  paper  and  pencil  version  of  a  test  by  The  Skylark  School  of  English  (Skylark 
School,  2012).  All  but  three  reported  that  they  were  right-handed  and  thus  likely  left 
hemisphere/right  ear  dominant  (Foundas  et  al.,  2006;  Kimura,  2011).  One  of  the  three  left  handers 
volunteered  that  previous  tests  had  confirmed  that  he  was  left  hemisphere  dominant. 

3.2  Apparatus 

The  test  facility  has  been  described  previously  (Abel  et  al.,  2012).  Listeners  were  tested 
individually  while  seated  in  front  of  a  laptop  computer  in  a  mock-up  of  a  CAF  land  vehicle,  the 
Bison  Command,  Control,  Communications  and  Intelligence  mobile  command  post  (Bison 
C3I MCP), in our centre’s Noise Simulation Facility (NSF). The NSF is a semi-reverberant room,
10.55  metres  (L)  by  6.10  metres  (W)  by  3.05  metres  (H).  An  array  of  loudspeakers  comprising 
four  low-frequency  drivers  (Bass  Tech  7;  ServoDrive  Inc.,  Glenview,  Illinois),  eight 
mid-frequency  drivers  (Gane  G218;  Equity  Sound  Investments  Inc.,  Bloomington,  Indiana),  and 
four high-frequency drivers (DMC 1152A; Electro-Voice, Burnsville, Minnesota) occupies the
width  of  the  shorter  rear  wall.  These  are  powered  by  fourteen  amplifiers  (8  stereo  model  4B 
and  6  mono  model  7B;  Bryston  Ltd.,  Peterborough,  Ontario).  This  array  allows  the  acoustic 
simulation  of  a  wide  range  of  CAF  environments,  in  terms  of  both  level  and  energy  spectrum,  and 
is  capable  of  producing  levels  in  excess  of  130  dB  SPL  (decibels,  sound  pressure  level).  The 
background  noise  in  this  facility  is  28  dB  SPL. 

All the listeners were fitted with a communication headset (Racal Slimgard II RA108; Esterline
Technologies Corp., Bellevue, Washington). The Racal headset is currently used by CAF
personnel  operating  land  vehicles. 
3.3 Experimental design


Each  listener  completed  a  task  under  twelve  different  listening  conditions.  The  task  involved 
listening  to  triads  of  messages  with  simultaneous  onsets,  presented  over  the  communication 
headset.  The  messages  were  spoken  by  two  male  talkers,  one  assigned  to  the  right  ear  (MR),  one 
assigned  to  the  left  ear  (ML)  and  a  female  talker  (F).  The  listening  conditions  were  defined  by 
combinations  of  the  background  (quiet  or  Bison  noise — the  continuous  playback  through  the 
loudspeakers of a digital recording of noise heard within a Bison C3I MCP being driven along a
highway),  the  ear  assignment  of  the  female  talker  (right  or  left)  and  the  availability  of 
supplementary  text  for  a  subset  of  target  messages  (none  -  N,  random  but  equally  likely  across  the 
three  talkers  -  RA,  or  associated  with  the  messages  spoken  by  one  of  the  three  talkers  -  AT).  In 
the  case  of  the  third  text  option,  the  twenty-four  listeners  were  divided  into  three  subgroups  of 
eight,  for  whom  the  text  was  associated  with  the  target  messages  spoken  by  only  one  of  the  three 
talkers,  ML,  MR  or  F,  respectively.  These  subgroups  were  labelled  ATML  (associated  text  male 
left),  ATMR  (associated  text  male  right)  and  ATF  (associated  text  female).  This  measure  was 
instituted to reduce the number of listening conditions within listener. Each subgroup
comprised four males and four females, selected randomly from the total group. The full list of
the  independent  and  dependent  variables,  along  with  acronyms  associated  with  levels  of  each,  is 
given  in  Table  1. 
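
The factorial structure described above can be enumerated in a few lines. This is only an illustrative sketch: the labels follow the acronyms used in the text, while the variable and function names are assumptions introduced here and do not come from the study's software.

```python
from itertools import product

# Levels of the three within-subject factors that define a listening condition.
BACKGROUNDS = ("quiet", "Bison noise")
FEMALE_EAR = ("FL", "FR")      # female talker assigned to the left or right ear
TEXT = ("N", "RA", "AT")       # no text, random text, text associated with one talker


def listening_conditions():
    """Return every background x female-ear x text combination."""
    return list(product(BACKGROUNDS, FEMALE_EAR, TEXT))


conditions = listening_conditions()
for background, ear, text in conditions:
    print(background, ear, text)
```

The 2 x 2 x 3 crossing gives the twelve conditions completed by each listener; the AT level differs only in which talker (ML, MR or F) the listener's subgroup was assigned.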

The  messages  were  taken  from  the  Coordinate  Response  Measure  (CRM),  a  non-standardized 
speech  corpus  for  multi-talker  communications  research,  adapted  by  Bolia  et  al.  (2000)  to 
measure  speech  intelligibility  in  military  environments.  Each  message  in  the  corpus  consists  of  a 
recording of a talker speaking a call sign followed by a colour-number combination within a
carrier phrase (e.g., “Baron go to Blue Five now”). In all, there are 256 messages in the corpus,
made up of combinations of eight call signs (Charlie, Ringo, Laker, Hopper, Arrow, Tiger, Eagle
and  Baron),  four  colours  (blue,  red,  white  and  green)  and  eight  numbers  (1,  2,  3,  4,  5,  6,  7,  and  8). 
Recorded  lists  spoken  by  four  male  and  four  female  talkers  are  available.  For  the  present  study, 
the  lists  spoken  by  two  of  the  males  and  one  of  the  females  were  used  to  maximize  the  distinction 
across  messages  within  a  triad.  Messages  in  the  corpus  for  each  of  the  three  talkers  were  digitally 
stored  as  a  single  file  on  a  computer  hard  drive,  with  the  carrier  words  removed.  The  duration  of 
each  three-word  message  was  approximately  three  seconds. 
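
The size of the corpus follows directly from the counts given above. The sketch below simply enumerates the combinations to show where 256 messages and 32 colour-number pairings come from; it does not touch the recordings, the call-sign spellings follow the published CRM corpus, and the variable names are illustrative assumptions.

```python
from itertools import product

# Call signs as listed in the published CRM corpus; colours and numbers as in the text.
CALL_SIGNS = ("Charlie", "Ringo", "Laker", "Hopper", "Arrow", "Tiger", "Eagle", "Baron")
COLOURS = ("blue", "red", "white", "green")
NUMBERS = tuple(range(1, 9))

# One entry per (call sign, colour, number) message spoken by a single talker.
corpus = list(product(CALL_SIGNS, COLOURS, NUMBERS))

# Colour-number pairings available once the call sign is fixed.
colour_number_pairs = list(product(COLOURS, NUMBERS))

print(len(corpus), len(colour_number_pairs))
```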

Seventy-eight  triads  of  messages  were  presented  in  each  condition.  The  number  of  triads  was 
constrained  by  the  requirement  to  complete  the  experiment  with  instructions,  practice  and 
debriefing  in  two  two-hour  sessions.  Listeners  were  instructed  to  respond  only  to  those  messages 
which  began  with  a  pre -assigned  call  sign  (the  target  messages).  All  were  assigned  Baron  as  their 
call  sign.  In  a  previous  study,  statistically  significant  differences  in  outcome  were  observed  for 
target  messages  beginning  with  Baron  and  Charlie,  favouring  Baron,  in  spite  of  similarities  in  the 
levels and spectra of the messages they began (Abel et al., 2014). The difference in outcome may
have  been  due  to  voicing  (Dubno  et  al.,  1981).  Baron  begins  with  a  voiced  consonant  and  Charlie 
with  a  voiceless  consonant.  In  the  present  experiment,  non-target  messages  began  with  the  call 
sign  Ringo.  Baron  and  Ringo  are  two  of  three  possible  alternatives  (Baron,  Ringo  and  Laker)  in 
the  call  sign  set  that  begin  with  voiced  consonants.  Published  studies  have  shown  that 
consonant-vowel-consonant  words  (CVCs)  beginning  with  “b”  are  less  likely  to  be  confused  with 
CVCs beginning with “r” than with “l” (Woods et al., 2010).

The  target  call  sign  (Baron)  began  one  of  the  three  messages  (the  target  messages)  on  27  of 
the  78  triads  (34.6%  probability  of  occurrence),  nine  for  each  of  the  three  talkers.  The  probability 
of target messages is in line with the probability of occurrence of critical signals in vigilance
experiments  (for  a  review  see  Abel,  2009).  For  the  no  text  condition,  none  of  the  27  target 
messages  was  accompanied  by  text.  In  the  random  text  condition,  text  accompanied  three  of  the 
nine  messages  spoken  by  each  of  the  three  talkers,  randomly  determined.  For  the  one  talker  text 
condition  all  nine  target  messages  from  one  of  the  talkers  were  accompanied  by  text.  As  noted 
above,  the  selected  target  talker  was  counterbalanced  across  listeners,  with  subgroups  of  eight 
assigned  one  of  the  three,  respectively.  A  target  message  could  be  spoken  by  only  one  of  the 
talkers  during  any  triad,  and  never  twice  in  succession  by  the  same  talker.  It  should  be  noted  that 
text  only  accompanied  target  messages.  For  both  target  and  non-target  triads,  the  assignment  of 
the  32  possible  colour-number  pairings  was  random  with  the  restriction  that  a  particular 
colour-number  pairing  could  only  occur  once  within  a  triad.  Except  for  these  restrictions,  they 
were  selected  randomly  and  independently  for  each  listener. 
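
One way to generate a block that satisfies the constraints described above is sketched here. It is not the program used in the study. Two points are assumptions made for illustration: the rule that a target is never spoken twice in succession by the same talker is applied to the sequence of target triads, and the positions of the 27 target triads within the block are drawn uniformly at random. All names are hypothetical.

```python
import random

TALKERS = ("ML", "MR", "F")
COLOURS = ("blue", "red", "white", "green")
NUMBERS = tuple(range(1, 9))
PAIRS = [(c, n) for c in COLOURS for n in NUMBERS]  # the 32 colour-number pairings


def target_talker_order(rng, per_talker=9):
    """Random talker sequence with equal counts and no talker twice in a row."""
    total = per_talker * len(TALKERS)
    while True:  # restart whenever the random draw runs into a dead end
        remaining = {t: per_talker for t in TALKERS}
        order = []
        for _ in range(total):
            options = [t for t in TALKERS
                       if remaining[t] and (not order or t != order[-1])]
            if not options:
                break
            weights = [remaining[t] for t in options]
            pick = rng.choices(options, weights=weights)[0]
            remaining[pick] -= 1
            order.append(pick)
        if len(order) == total:
            return order


def make_block(rng, n_triads=78, n_targets=27):
    """Build one block: each triad holds one message per talker."""
    talkers = iter(target_talker_order(rng, n_targets // len(TALKERS)))
    target_slots = set(rng.sample(range(n_triads), n_targets))
    block = []
    for i in range(n_triads):
        pairs = rng.sample(PAIRS, len(TALKERS))  # pairings are distinct within a triad
        target = next(talkers) if i in target_slots else None
        messages = {
            t: ("Baron" if t == target else "Ringo", colour, number)
            for t, (colour, number) in zip(TALKERS, pairs)
        }
        block.append({"target_talker": target, "messages": messages})
    return block


block = make_block(random.Random(0))
targets = [t["target_talker"] for t in block if t["target_talker"]]
print(len(block), len(targets))
```

Seeding the generator differently for each listener would give the independent randomization per listener that the text describes.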

The  target  messages  were  presented  at  an  at-ear  level  of  70  dB  SPL  and  the  non-target  messages 
at  an  at-ear  level  of  67  dB  SPL.  Intensity  differences  between  talkers  have  been  shown  to  aid 
differentiation  (Drullman  and  Bronkhorst,  2004).  For  the  Bison  noise  conditions,  a  digital 
recording  was  played  over  the  loudspeaker  array  in  the  test  room  (outside  the  mock-up  of  Bison 
C3I MCP)  at  an  at-ear  level  under  the  headset  of  65  dBA  (decibels,  A-weighted).  At  source  the 
level  was  95  dBA  which  is  about  5  dB  lower  than  the  level  measured  inside  a  light  armoured 
vehicle  driving  along  a  highway  (Nakashima  et  al.,  2007).  The  at-ear  SNRs,  2-5  dB,  have 
previously  been  shown  to  result  in  speech  understanding  for  single  talkers  in  the  range  of  60-80% 
(Abel et al., 1990).
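
The quoted SNR range appears to follow from the presentation levels given above. As a rough worked example, assuming the speech levels in dB SPL and the at-ear noise level in dBA can be compared directly (a simplification, since the two weightings differ):

```latex
\mathrm{SNR} = L_{\text{speech}} - L_{\text{noise}}, \qquad
\mathrm{SNR}_{\text{target}} \approx 70 - 65 = 5\ \text{dB}, \qquad
\mathrm{SNR}_{\text{non-target}} \approx 67 - 65 = 2\ \text{dB}
```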

The listener was instructed to respond to a target message by pressing four response keys, in
order,  on  a  standard  laptop  computer  keyboard.  These  were  coded  for  the  perceived  ear  (one  of 
two  labeled  keys),  the  talker’s  gender  (one  of  two  labeled  keys),  the  colour  (one  of  four  labeled 
keys)  and  the  number  (one  of  eight  labeled  keys),  respectively.  The  keys  for  each  of  these 
attributes  of  the  message  were  located  centrally  on  different  rows  of  the  keyboard.  No  feedback 
was  given  about  the  correctness  of  the  responses.  The  rate  of  presentation  of  triads,  one  every 
seven seconds, was controlled by a computer program. Pilot testing confirmed that the messages
could  be  heard  distinctly  and  that  listeners  had  adequate  time  to  respond.  It  took  15  minutes  to 
present  each  condition,  including  the  time  to  inform  listeners  of  the  upcoming  condition.  The 
twelve  conditions  were  presented  in  two  sets  of  six,  one  set  for  the  quiet  background  conditions 
and  one  set  for  the  Bison  noise  background  conditions,  during  two  consecutive  sessions  that  were 
no  more  than  one  week  apart.  The  order  of  the  backgrounds  was  counterbalanced  within  male  and 
female  listener  subgroups. 

In  the  text  conditions,  the  components  of  the  selected  target  messages  (e.g.,  “Right  Male  Blue 
Five”)  were  presented  vertically  on  four  separate  lines,  centred  on  the  monitor  of  a  standard 
laptop  computer,  in  Times  New  Roman  14  point  font.  Text  and  audio  onsets  were  simultaneous. 
However,  the  text  was  approximately  one-third  second  longer  than  the  audio  to  allow  sufficient 
time  to  be  read  (Abel,  2014). 
Table 1: The independent and dependent variables.


INDEPENDENT VARIABLES

A. Within Subjects:

   Background                            - quiet or Bison noise
   Ear assignment for the female talker  - female left (FL) or female right (FR)
   Text                                  - none (N), random (RA), or
                                           associated with a particular talker (AT)
   Target Talker                         - male left (ML), male right (MR), female (F)

B. Between Subjects:

   Gender of Listeners                   - males (N=12) and females (N=12)
   Job Type                              - military (N=16) and civilian (N=8)
   Order of Backgrounds                  - quiet first (N=12) and Bison noise first (N=12)
   Associated Text                       - associated text, male left (ATML, N=8),
                                           associated text, male right (ATMR, N=8),
                                           associated text, female (ATF, N=8)


DEPENDENT VARIABLES

Percentage of hits - correct report of the target message: ear, gender of the talker, colour and
   number for the 27 target messages beginning with the call sign Baron
Percentage correct reports of the ear to which the target message was sent
Percentage correct reports of the gender of the talker of the target message
Percentage correct reports of the colour component of the target message
Percentage correct reports of the number component of the target message
Percentage of misses, i.e., not responding to a target message
Percentage of false alarms, i.e., responding to any of the messages in the 51 triads in which
   there was no target message
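
The dependent variables in Table 1 can be computed from trial records along the following lines. This is a minimal sketch, not the study's scoring code. It assumes each trial is stored as a target and a response, both given as (ear, gender, colour, number) tuples or None, and that the per-attribute percentages use the number of target messages as the denominator. The data layout and names are hypothetical.

```python
ATTRIBUTES = ("ear", "gender", "colour", "number")


def percent(count, total):
    """Percentage of count in total, or 0.0 when there are no trials."""
    return 100.0 * count / total if total else 0.0


def score_block(trials):
    """Score one block of trials against the dependent variables in Table 1."""
    targets = [t for t in trials if t["target"] is not None]
    non_targets = [t for t in trials if t["target"] is None]
    answered = [t for t in targets if t["response"] is not None]

    scores = {
        # A hit requires all four attributes to be reported correctly.
        "hits": percent(sum(t["response"] == t["target"] for t in answered), len(targets)),
        # A miss is a target triad with no response.
        "misses": percent(len(targets) - len(answered), len(targets)),
        # A false alarm is any response to a triad without a target.
        "false_alarms": percent(
            sum(t["response"] is not None for t in non_targets), len(non_targets)
        ),
    }
    for i, name in enumerate(ATTRIBUTES):
        correct = sum(t["response"][i] == t["target"][i] for t in answered)
        scores[name] = percent(correct, len(targets))
    return scores


# Made-up trials for illustration only; these are not data from the study.
demo_trials = [
    {"target": ("R", "M", "blue", 5), "response": ("R", "M", "blue", 5)},   # hit
    {"target": ("L", "F", "red", 2), "response": ("L", "F", "red", 3)},     # wrong number
    {"target": ("L", "M", "green", 1), "response": None},                   # miss
    {"target": None, "response": None},                                     # correct rejection
    {"target": None, "response": ("R", "M", "white", 8)},                   # false alarm
]
scores = score_block(demo_trials)
print(scores)
```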
3.4 Procedure


The  protocol  was  approved  by  the  Defence  Research  and  Development  Canada  Human  Research 
Ethics  Committee  (DRDC  HREC).  Volunteers  were  asked  to  review  an  information  sheet  and 
sign  a  consent  form  prior  to  participation.  At  the  start  of  each  session,  they  were  fitted  with  the 
headset  by  a  trained  technician.  They  were  then  presented  a  sample  of  the  Bison  noise,  listened  to 
audio and viewed text on the laptop monitor for sample messages, and practiced responding.
Feedback  was  given  for  the  practice  but  not  for  the  experimental  trials.  Before  each  block  of 
78  triads,  listeners  were  informed  of  the  details  of  the  upcoming  condition  on  the  laptop  they  used 
for  responding.  They  were  instructed  that  they  were  permitted  to  use  either  or  both  hands  for 
responding.  Short  breaks  separated  the  six  conditions  presented  during  a  session.  The  total 
duration  of  each  of  the  two  experimental  sessions,  including  the  time  for  instructions,  practice, 
breaks  and  debriefing  was  two  hours. 

4  Results 


The  mean  percentages  and  associated  standard  deviations,  across  listeners,  for  hits,  i.e.,  correctly 
reporting  all  of  the  ear,  gender,  colour  and  number  for  the  target  messages,  are  presented  in 
Table  2,  for  combinations  of  the  background  (quiet  or  Bison  noise),  the  ear  assignment  of  the 
female talker (female left, FL or female right, FR), the text condition (no text, N; random text,
RA; or text associated with a talker, AT) and the target talker. With respect to the last of the text
options,  results  are  also  presented  separately  for  the  three  subgroups  of  eight  listeners,  ATML, 
ATMR  and  ATF,  for  whom  the  text  was  specifically  associated  with  one  of  the  three  talkers,  ML, 
MR  and  F,  respectively.  The  overall  mean  percentage  of  hits,  averaged  across  conditions,  was  77.5%. 

A  repeated  measures  analysis  of  variance  (ANOVA;  Daniel,  1983)  was  applied  to  the  percentage 
of  hits  obtained  from  all  twenty-four  listeners  for  combinations  of  conditions  defined  by  the 
background,  ear  of  the  female  talker  (FL  and  FR),  text  condition  (N,  RA  and  AT),  and  target 
talker (ML, MR and F), with gender as the between subjects factor. The analysis showed that the
gender of listeners and the text condition were not significant factors. There were statistically
significant effects of the background (F(1,22) = 4.78; p < 0.04), target talker (F(2,44) = 7.28; p < 0.002), ear
assignment of the female talker by target talker (F(2,44) = 15.65; p < 0.0001), and background by ear
assignment of the female talker by target talker (F(2,44) = 4.66; p < 0.01). An unexpected finding was
that  listeners’  percentage  of  hits  was  significantly  higher  in  the  Bison  noise  than  in  quiet  by  6.6% 
(80.8%  vs  74.2%).  Post  hoc  pairwise  comparisons  using  Fisher’s  Least  Significant  Difference 
(LSD)  test  (a  =  0.05)  (Daniel,  1983)  showed  that,  in  the  case  of  the  main  effect  for  the  target 
talker,  the  percentage  of  hits  for  MR  (85.8%)  was  significantly  higher  than  that  for  either  ML 
(73.8%)  or  F  (73.0%)  who  were  no  different  from  each  other.  However,  as  shown  in  Figure  1,  this 
outcome  was  dependent  on  the  ear  assignment  of  F.  When  the  female  talker  was  assigned  to  the 
left  ear,  the  mean  score  was  significantly  higher  for  MR  (93.1%)  than  for  either  ML  or  F  who 
were  not  different  (66.8%  and  69.0%,  respectively).  In  contrast,  when  the  female  talker  was 
assigned  to  the  right  ear,  there  were  no  differences  among  the  three  talkers.  Scores  ranged  from 
76.9%  to  80.7%.  The  score  for  the  female  target  talker  was  relatively  higher  but  only  borderline 
statistically  significant  when  she  was  assigned  to  the  right  ear  compared  with  left  ear 
(76.9%  vs  69.0%). 

Repeated measures ANOVAs, with gender as the between subjects factor, were also carried out
for the percentages correct for each of the target message ear, gender, colour and number, taken
separately. The observed overall means were 90.9%, 90.4%, 82.3% and 85.5%, respectively. The
pattern of results was similar to that observed for the percentage of hits, i.e., correct report of all
of ear, gender, colour and number. Statistically significant outcomes of the repeated measures
ANOVAs are listed in Table 3. For each of the four elements of the report, there were significant
effects of the background and of the interaction of the ear assignment of the female talker by target
talker. In the case of the percentage correct ear reports, there was also a significant three-way
interaction of the ear assignment of the female talker by text condition by target talker (F4,88=3.09;
p<0.02). Follow up post hoc pairwise comparisons using Fisher’s LSD test (α = 0.05) indicated
that  within  the  ear  assignment  of  the  female  talker,  the  differences  among  the  three  text  conditions 
(N,  RA,  and  AT)  did  not  reach  statistical  significance  for  any  of  the  three  target  talkers. 
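
The degrees of freedom attached to the F ratios above can be checked against the design. The following is a minimal sketch, assuming a conventional mixed-design ANOVA in which each within-subject effect is tested against its interaction with listeners nested in the between-subjects groups. The function and constant names are illustrative and do not come from the report.

```python
from math import prod

N_LISTENERS = 24
BETWEEN_LEVELS = 2  # listener gender (between subjects factor)
# Number of levels of each within-subject factor, as described in the text.
WITHIN_LEVELS = {"background": 2, "female_ear": 2, "text": 3, "talker": 3}


def within_effect_df(factors, n_listeners=N_LISTENERS, between_levels=BETWEEN_LEVELS):
    """Return (effect df, error df) for a within-subject main effect or interaction.

    Assumes the textbook error term: effect x listeners-within-groups.
    """
    effect_df = prod(WITHIN_LEVELS[name] - 1 for name in factors)
    error_df = effect_df * (n_listeners - between_levels)
    return effect_df, error_df


for effect in (["background"], ["talker"], ["female_ear", "talker"],
               ["background", "female_ear", "talker"],
               ["female_ear", "text", "talker"]):
    print(" x ".join(effect), within_effect_df(effect))
```

Under these assumptions the sketch yields 1 and 22 for the background, 2 and 44 for the target talker and for the interactions that involve it without the text condition, and 4 and 88 for the interaction of female ear by text condition by target talker, which agrees with the values quoted in the text.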

Table 2: The percentage of hits (correct ear, gender, colour and number) for combinations of the
background, ear assignment of the female talker, text condition, and target talker.

                                  Background / Female Ear Assignment
                                  Quiet                         Bison Noise
Text Condition / Talker     N     Female Left    Female Right   Female Left    Female Right

No Text                     24
  Male Left                       54.3 (35.9)*   76.8 (29.5)    70.4 (29.6)    79.2 (27.3)
  Male Right                      95.2 (9.5)     68.1 (32.0)    96.0 (6.6)     85.0 (19.8)
  Female                          65.4 (31.6)    73.5 (20.6)    71.8 (27.1)    77.6 (15.6)
  Total                           71.4 (15.9)    72.8 (19.4)    79.3 (17.3)    80.6 (14.7)

Random Text                 24
  Male Left                       64.3 (29.5)    84.3 (19.2)    70.8 (27.7)    79.1 (24.5)
  Male Right                      88.1 (18.6)    74.1 (29.7)    93.8 (13.3)    85.8 (22.4)
  Female                          66.2 (30.7)    75.9 (21.4)    72.7 (29.4)    83.7 (15.5)
  Total                           72.8 (16.4)    78.1 (16.2)    79.0 (20.4)    82.8 (13.8)

Assoc Text                  24
  Male Left                       66.3 (36.2)    81.2 (27.3)    75.0 (31.7)    83.4 (23.7)
  Male Right                      92.4 (15.8)    72.8 (31.4)    93.3 (14.4)    84.8 (22.9)
  Female                          64.9 (35.4)    71.8 (30.1)    73.2 (28.0)    79.0 (17.3)
  Total                           74.6 (16.4)    75.2 (19.0)    80.3 (13.2)    82.3 (12.0)

Assoc Text, Male Left       8
  Male Left                       94.3 (12.2)    95.6 (8.6)     94.0 (6.4)     91.5 (16.8)
  Male Right                      81.6 (23.2)    67.4 (24.6)    87.4 (23.6)    82.8 (18.9)
  Female                          58.0 (40.9)    75.9 (22.6)    59.3 (35.4)    80.0 (17.7)
  Total                           77.9 (18.1)    79.8 (10.6)    80.0 (14.9)    84.8 (12.7)

Assoc Text, Male Right      8
  Male Left                       59.1 (37.6)    72.0 (36.2)    77.0 (17.6)    84.3 (17.0)
  Male Right                      97.1 (8.1)     81.8 (35.1)    97.0 (5.6)     92.8 (12.1)
  Female                          58.0 (37.8)    57.8 (29.8)    73.3 (25.4)    77.1 (17.8)
  Total                           71.5 (16.3)    70.4 (20.0)    82.4 (11.5)    84.5 (9.5)

Assoc Text, Female          8
  Male Left                       45.4 (36.0)    76.0 (26.3)    54.0 (45.0)    74.5 (33.3)
  Male Right                      98.5 (4.2)     69.1 (35.5)    95.5 (6.2)     78.8 (33.2)
  Female                          78.8 (26.3)    81.9 (34.7)    87.0 (15.3)    80.0 (18.7)
  Total                           74.0 (16.3)    75.5 (24.9)    78.6 (14.4)    77.8 (13.6)

*Mean (standard deviation)
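
The Assoc Text rows for all twenty-four listeners in Table 2 appear to be the unweighted averages of the corresponding rows for the three subgroups of eight. The sketch below checks this for the male left target talker. It assumes equal subgroup sizes, so that a simple mean is appropriate; the variable and function names are illustrative only.

```python
# Percentage of hits for the Male Left target talker, copied from Table 2.
# Column order: Quiet/Female Left, Quiet/Female Right, Bison/Female Left, Bison/Female Right.
SUBGROUP_ML = {
    "ATML": [94.3, 95.6, 94.0, 91.5],
    "ATMR": [59.1, 72.0, 77.0, 84.3],
    "ATF": [45.4, 76.0, 54.0, 74.5],
}
# The Assoc Text (N=24) row for the same talker, also from Table 2.
POOLED_ML = [66.3, 81.2, 75.0, 83.4]


def pooled_means(subgroups):
    """Average each column across equally sized subgroups."""
    columns = zip(*subgroups.values())
    return [sum(column) / len(column) for column in columns]


for computed, reported in zip(pooled_means(SUBGROUP_ML), POOLED_ML):
    print(f"computed {computed:.2f}  reported {reported:.1f}")
```

The computed values agree with the reported row to within rounding, which suggests the N=24 rows can be read as summaries of the subgroup rows rather than as independent data.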

[Figure 1 image not reproduced. Horizontal axis: Ear Assignment of the Female Talker; legend: target talker.]

Figure 1: The interaction of the ear assignment of the female talker
and the target talker on the percentage of hits (N=24).

Table 3: Summary of statistically significant outcomes for the repeated measures ANOVAs on the
percentage of hits and the percentages correct for each of ear, gender, colour and number.

Analysis  Significant Outcome                               F       p<

Hits      Background                                        4.79    0.04
          Target Talker                                     7.28    0.002
          Female Ear by Target Talker                       15.65   0.0001
          Background by Female Ear by Target Talker         4.66    0.01

Ear       Background                                        11.31   0.003
          Female Ear by Target Talker                       7.58    0.003
          Female Ear by Text Condition by Target Talker     3.09    0.02

Gender    Background                                        15.42   0.001
          Female Ear                                        6.81    0.02
          Target Talker                                     3.12    0.05
          Female Ear by Target Talker                       8.72    0.001
          Background by Female Ear by Target Talker         5.88    0.005

Colour    Background                                        7.66    0.01
          Target Talker                                     8.46    0.001
          Female Ear by Target Talker                       15.91   0.001
          Background by Female Ear by Target Talker         5.21    0.01

Number    Background                                        11.16   0.003
          Target Talker                                     3.39    0.04
          Female Ear by Target Talker                       10.59   0.0001
          Background by Female Ear by Target Talker         4.12    0.02

4.1  The  effect  of  background 

In  order  to  shed  light  on  the  statistically  significant  beneficial  effect  of  the  Bison  noise  compared 
with  quiet,  a  repeated  measures  ANOVA  was  applied  to  the  percentage  of  hits,  with  order  of  the 
two  backgrounds  (quiet  conditions  first  versus  Bison  noise  conditions  first)  as  the  between 
subjects  factor  rather  than  gender.  The  results  showed  statistically  significant  effects  of  the 
background (F1,22=9.52; p<0.005), target talker (F2,44=7.41; p<0.002), background by order
(F1,22=22.71; p<0.0001), ear assignment of the female talker by target talker (F2,44=14.75;
p<0.0001),  background  by  ear  assignment  of  the  female  talker  by  target  talker  (F2,44=5.34; 
p<0.008),  and  background  by  ear  assignment  of  the  female  talker  by  target  talker  by  order 
(F2,44=3.34;  p<0.04).  The  significant  two-way  interaction  of  background  by  order  is  displayed  in 
Figure  2,  averaged  across  the  other  variables.  These  data  show  that  listeners’  scores  were 
relatively  greater,  on  average,  for  the  background  condition  that  was  presented  second,  regardless 
of  whether  it  was  the  quiet  or  Bison  noise,  suggesting  that  practice  was  a  possible  determinant  of 
outcome  rather  than  a  beneficial  effect  of  the  noise.  Post  hoc  pairwise  comparisons  using  Fisher’s 
LSD test (α<0.05) indicated that the percentage of hits for the quiet condition presented first
(67.9%) was significantly lower than the percentages of hits for the quiet condition presented second
(80.5%) and for the Bison noise condition presented first (76.9%) or second (84.7%), which did not
differ from one another.
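
The relation between these four cell means and the overall background effect can be illustrated with a short calculation. This is a sketch only: it assumes that the two presentation-order groups were of equal size (twelve listeners each), so that each background mean is the simple average of its two cells. The names used are illustrative.

```python
# Mean percentage of hits for each background and presentation position,
# as quoted in the text above.
CELL_MEANS = {
    ("quiet", "first"): 67.9,
    ("quiet", "second"): 80.5,
    ("noise", "first"): 76.9,   # Bison noise
    ("noise", "second"): 84.7,  # Bison noise
}


def background_mean(background):
    """Average a background over both presentation positions (equal group sizes assumed)."""
    values = [v for (bg, _), v in CELL_MEANS.items() if bg == background]
    return sum(values) / len(values)


def practice_gain(background):
    """Second-position score minus first-position score for one background."""
    return CELL_MEANS[(background, "second")] - CELL_MEANS[(background, "first")]


quiet, noise = background_mean("quiet"), background_mean("noise")
print(f"quiet {quiet:.1f}%, noise {noise:.1f}%, noise minus quiet {noise - quiet:.1f}")
for bg in ("quiet", "noise"):
    print(f"{bg}: second minus first = {practice_gain(bg):.1f}")
```

The averages reproduce the overall means of 74.2% for quiet and 80.8% for the Bison noise. The second-minus-first differences are larger than the difference between the two backgrounds, which is in keeping with the practice interpretation offered above.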

Figure 2: The effect of order of the quiet and Bison noise backgrounds
on the percentage of hits (N=12).


4.2  Civilian  versus  military  listeners 

Sixteen  of  the  listeners  were  military  members  and  had  some  on-the-job  training  in  radio 
communications.  Eight  were  civilians  whose  only  experience  was  previous  participation  in 
experiments  involving  auditory  perception.  Results  for  the  two  groups  were  compared  using  the 
nonparametric Independent-Samples Mann-Whitney U test (Conover, 1980). The test was applied
to the sum of the percentages of hits observed for all three talkers, for each of the twelve
combinations of the background, ear assignment of the female talker and text (N, RA and AT). In
none  of  these  twelve  conditions  was  there  a  statistically  significant  difference  between  the 
military  and  civilian  groups. 
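
For readers unfamiliar with the Mann-Whitney U test, the statistic itself is straightforward to compute from ranks. The sketch below is a generic pure-Python implementation using midranks for ties; it is not the software used in the study. The scores in the usage example are hypothetical placeholders chosen for illustration and are not data from this experiment.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using midranks for ties."""
    pooled = sorted(sample_a + sample_b)
    # Assign each distinct value the average of the 1-based positions it occupies.
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        midrank[pooled[i]] = (i + 1 + j) / 2  # mean of positions i+1 .. j
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(midrank[x] for x in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# HYPOTHETICAL summed hit percentages (three talkers, so 0 to 300) for one condition.
military = [240.0, 251.5, 228.0, 262.0]
civilian = [233.5, 247.0, 251.5]
print(mann_whitney_u(military, civilian))
```

The smaller of the two U values would then be compared with a critical value, or converted to a p-value, for the group sizes concerned (sixteen and eight in this study). A statistics package such as SciPy provides this through `scipy.stats.mannwhitneyu`.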

4.3  Associating  text  with  a  target  talker 

In previous ANOVAs, the three subgroups (ATML, ATMR and ATF), for whom text was associated
with the male left (ML), male right (MR) or female (F) talker, respectively, were treated as one
group of twenty-four listeners, all of whom completed the task with no text, random text or text
associated with a talker. The results indicated that the text condition was not a significant factor.
A subsequent repeated measures ANOVA was carried out to compare the effect of the text
condition within each of the three associated text subgroups, with subgroup as the between
subjects factor. The results showed that the three subgroups, ATML, ATMR and ATF, were not
different as a main
effect. There were statistically significant effects of the background (F1,21=5.32; p<0.03), target
talker (F2,42=8.44; p<0.001), ear assignment of the female talker by target talker (F2,42=15.22;
p<0.0001), and background by ear assignment of the female talker by target talker (F2,42=4.67;
p<0.02), as in previous analyses, as well as the text condition by target talker by the associated
text subgroup (F8,84=3.84; p<0.001).

The  significant  three-way  interaction,  text  condition  by  target  talker  by  associated  text  subgroup, 
is  displayed  as  two  two-way  interactions  in  Figures  3  and  4.  Figure  3  compares  the  results  for  the 
three  target  talkers,  ML,  MR,  and  F  within  each  of  the  three  associated  text  subgroups,  ATML, 
ATMR  and  ATF,  averaged  across  the  text  condition.  Figure  4  compares  the  three  text  conditions 
(none,  random  and  associated)  within  each  of  the  associated  text  subgroups,  averaged  across  the 
target  talker.  With  respect  to  the  data  in  Figure  3,  post  hoc  pairwise  comparisons  indicated  that 
when  ML  was  accompanied  by  text  in  subgroup  ATML,  the  percentage  of  hits  was  significantly 
higher  than  when  not  accompanied  (93.8%  compared  with  73.1%  in  the  case  of  ATMR  and 
62.5%  in  the  case  of  ATF).  When  MR  was  accompanied  by  text  in  subgroup  ATMR,  the 
percentage  of  hits  was  significantly  higher  than  when  not  accompanied  (92.2%  compared  with 
79.8%  in  the  case  of  ATML  but  similar  at  85.5%  in  the  case  of  ATF).  When  F  was  accompanied 
by  text  in  subgroup  ATF,  the  percentage  of  hits  was  relatively  (but  not  significantly)  greater  than 
when  not  accompanied  (81.9%  compared  with  68.3%  and  66.5%  in  the  cases  of  ATML  and 
ATMR, respectively). In summary, the gain amounted to 26% for the male left target talker,
10% for the male right target talker and 15% for the female target talker. With respect to the data
in  Figure  4,  within  each  of  the  three  associated  text  subgroups,  the  percentage  of  hits  was 
relatively  higher  when  a  particular  talker  was  accompanied  by  text  than  in  the  no  text  and  random 
text  conditions.  Differences  ranged  from  9-15%.  Post  hoc  pairwise  comparisons  indicated  that 
only  in  one  instance  did  the  difference  reach  statistical  significance,  the  difference  between  no 
text  and  associated  text  for  the  female  talker  in  the  ATF  subgroup. 
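
The gains of 26%, 10% and 15% quoted above can be recovered from the subgroup means given in the same paragraph. The sketch below assumes that each gain is the accompanied score minus the average of the scores for the same talker in the two subgroups where that talker was not accompanied by text. This reading is an inference from the numbers rather than a stated method, and the names are illustrative.

```python
# Mean percentage of hits per target talker within each associated text subgroup,
# as quoted in the text above.
HITS = {
    "ML": {"ATML": 93.8, "ATMR": 73.1, "ATF": 62.5},
    "MR": {"ATML": 79.8, "ATMR": 92.2, "ATF": 85.5},
    "F": {"ATML": 68.3, "ATMR": 66.5, "ATF": 81.9},
}
# Subgroup in which each talker's target messages were accompanied by text.
ACCOMPANIED_IN = {"ML": "ATML", "MR": "ATMR", "F": "ATF"}


def text_gain(talker):
    """Accompanied score minus the mean of the two unaccompanied scores (percentage points)."""
    own = ACCOMPANIED_IN[talker]
    others = [v for group, v in HITS[talker].items() if group != own]
    return HITS[talker][own] - sum(others) / len(others)


for talker in HITS:
    print(f"{talker}: gain of about {text_gain(talker):.1f} percentage points")
```

Under this assumption the differences come to roughly 26, 10 and 15 percentage points for the male left, male right and female target talkers, respectively.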


[Figure 3 image not reproduced. Horizontal axis: Associated Text Subgroup (ATML, ATMR, ATF).]

Figure 3: The effect of associated text on the percentage of hits
for the three target talkers (N=8).

[Figure 4 image not reproduced. Horizontal axis: Associated Talker Subgroup (ATML, ATMR, ATF).]

Figure 4: The effect of the text condition on the percentage of hits within
each associated text subgroup (N=8).


4.4  The  prevalence  of  false  alarms  and  misses 

Table  4  shows  the  percentage  of  false  alarms,  i.e.,  responding  to  a  non-target  message.  For 
combinations  of  the  background,  ear  assignment  of  the  female  talker  and  the  text  condition,  these 
were  no  greater  than  4%,  averaged  across  listeners  and  talkers.  Standard  deviations  associated 
with the means were no greater than 9%. The mean percentage of misses in each of the
experimental conditions, averaged across listeners, is shown in Table 5. Mean values were
relatively low at 12% or less, except for four cases in the quiet condition when the female talker
was assigned to the left ear (FL) and the target talker was ML. These were the no text (17.9%), random
text (13.8%), associated text male right (16.6%) and associated text female (17.9%) conditions.
The  standard  deviations  associated  with  these  means  were  relatively  high,  ranging  from  19.8%  to 
34.6%,  compared  with  most  of  those  observed  for  the  other  conditions.  These  data  indicate  that 
listeners  had  relatively  little  difficulty  in  discriminating  target  from  non-target  messages. 
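
The response categories used in this section can be made concrete with a small scoring routine. This is a sketch under stated assumptions, not the scoring software used in the study. A hit and a false alarm follow the definitions given in the text. Treating a miss as no response to a target is an assumption, and the labels "incorrect report" and "correct rejection" are hypothetical names for outcomes the text does not name. The colour and number values in the example are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    ear: str      # "left" or "right"
    gender: str   # "male" or "female"
    colour: str
    number: int


def classify(target: Optional[Message], response: Optional[Message]) -> str:
    """Score one triad. `target` is None when the triad held no target message;
    `response` is None when the listener did not respond."""
    if target is None:
        # Any response to a triad without a target is a false alarm.
        return "false alarm" if response is not None else "correct rejection"
    if response is None:
        return "miss"  # assumed definition: target present, no response
    # A hit requires all four elements (ear, gender, colour, number) to be correct.
    return "hit" if response == target else "incorrect report"


target = Message(ear="right", gender="male", colour="red", number=4)  # placeholder values
print(classify(target, target))
print(classify(target, None))
print(classify(None, target))
```

Percentages of hits, misses and false alarms for a condition would then be obtained by counting these labels over the 78 triads in a block.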

Table 4: The percentage of false alarms for the various listening conditions.

                                  Quiet                         Bison Noise
Text Condition              N     Female Left    Female Right   Female Left    Female Right

No Text                     24    3.5 (9.0)*     3.4 (8.4)      2.0 (7.2)      0.8 (2.8)
Random Text                 24    3.8 (6.3)      3.3 (8.7)      1.2 (3.6)      1.5 (6.7)
Assoc Text                  24    2.1 (5.3)      2.8 (5.7)      1.6 (5.6)      1.2 (3.0)
Assoc Text, Male Left       8     1.1 (2.4)      4.0 (7.8)      1.6 (4.6)      0.9 (1.7)
Assoc Text, Male Right      8     1.3 (2.4)      1.9 (2.8)      0.0 (0.0)      1.0 (2.4)
Assoc Text, Female          8     3.9 (8.7)      2.4 (5.9)      3.1 (8.8)      1.6 (4.6)

*Mean (standard deviation)

Table 5: The percentage of misses for the various listening conditions.

                                  Background / Female Ear
                                  Quiet                         Bison Noise
Text Condition / Talker     N     Female Left    Female Right   Female Left    Female Right

No Text                     24
  Male Left                       17.9 (29.8)*   6.0 (16.8)     4.1 (13.7)     2.3 (11.2)
  Male Right                      0.0 (0.0)      4.1 (10.2)     0.0 (0.0)      0.5 (2.2)
  Female                          11.9 (15.5)    2.3 (4.6)      7.4 (22.1)     0.0 (0.0)
  Total                           9.8 (11.9)     3.9 (6.4)      3.7 (8.6)      0.9 (3.7)

Random Text                 24
  Male Left                       13.8 (19.8)    0.5 (2.2)      7.8 (17.6)     4.1 (15.8)
  Male Right                      0.5 (2.2)      1.4 (4.9)      0.9 (3.1)      0.9 (3.1)
  Female                          5.5 (9.2)      4.6 (11.7)     5.5 (16.2)     0.0 (0.0)
  Total                           6.4 (7.2)      2.0 (4.4)      4.6 (10.8)     1.6 (5.2)

Assoc Text                  24
  Male Left                       12.0 (25.5)    3.7 (9.5)      4.1 (12.9)     4.1 (18.0)
  Male Right                      0.5 (2.2)      3.2 (6.1)      0.5 (2.2)      0.9 (3.1)
  Female                          10.1 (15.5)    4.6 (14.1)     4.1 (7.8)      0.5 (2.2)
  Total                           7.2 (10.8)     3.6 (6.2)      2.7 (4.5)      1.7 (5.9)

Assoc Text, Male Left       8
  Male Left                       1.4 (3.9)      1.4 (3.9)      0.0 (0.0)      0.0 (0.0)
  Male Right                      1.4 (3.9)      4.1 (8.2)      0.0 (0.0)      1.4 (3.9)
  Female                          9.6 (14.9)     0.0 (0.0)      8.3 (11.4)     0.0 (0.0)
  Total                           3.8 (4.8)      1.8 (3.9)      2.6 (3.6)      0.4 (1.1)

Assoc Text, Male Right      8
  Male Left                       16.6 (34.6)    5.5 (15.6)     0.0 (0.0)      1.4 (3.9)
  Male Right                      0.0 (0.0)      2.8 (5.1)      0.0 (0.0)      0.0 (0.0)
  Female                          12.4 (18.1)    8.3 (19.3)     4.1 (5.7)      1.4 (3.9)
  Total                           9.4 (14.3)     5.1 (6.9)      1.1 (1.6)      0.8 (1.4)

Assoc Text, Female          8
  Male Left                       17.9 (26.9)    4.1 (5.7)      12.4 (20.7)    11.0 (31.1)
  Male Right                      0.0 (0.0)      2.8 (5.1)      1.4 (3.9)      1.4 (3.9)
  Female                          8.3 (15.3)     5.5 (15.6)     0.0 (0.0)      0.0 (0.0)
  Total                           8.5 (11.8)     3.9 (7.5)      4.4 (6.7)      4.0 (10.2)

*Mean (standard deviation)

5  Discussion


Listeners  in  the  present  study  had  relatively  little  difficulty,  on  average,  completing  the  auditory 
overload  task.  The  overall  mean  percentage  of  hits  (correct  report  of  all  of  the  ear  and  gender  of 
the  talker,  along  with  the  colour  and  number  of  the  target  message),  averaged  across  the  quiet  and 
Bison  noise  backgrounds,  ear  assignment  of  the  female  talker,  text  condition  and  target  talker  was 
78%.  This  outcome  is  in  the  range  reported  by  Abel  et  al.  (2014)  for  dichotic  headset  presentation 
of messages in quiet and Bison noise. Neither the gender nor the occupation (military or civilian)
of the listeners was a significant determinant of outcome. The outcomes for recorded male and
female  talkers  were  no  different.  These  data  suggest  that  there  would  be  no  advantage  to 
recording  warning  messages  in  either  a  male  or  female  voice  for  presentation  over 
communication  systems  during  military  operations  or  for  using  gender  as  a  selection  criterion  for 
operators. 

The  results  of  the  present  study  are  generally  consistent  with  previous  studies  showing  that 
gender  is  not  a  significant  determinant  of  speech  understanding  (e.g.,  Ericson  and  McKinley, 
1997;  Ellis  et  al.,  1996;  Markham  and  Hazan,  2004).  For  example,  Ellis  et  al.  (1996)  reported 
that  there  was  no  significant  difference  between  adult  male  and  female  listeners’  magnitude 
estimation  judgments  of  the  intelligibility  of  taped  utterances  of  speech  samples  by  males  and 
females.  However,  females’  subjective  impression  was  that  male  voices  were  more 
understandable  and  males’  subjective  impression  was  that  female  voices  were  more  intelligible, 
possibly  reflecting  personal  gender  bias.  Markham  and  Hazan  (2004)  investigated  the 
intelligibility of words recorded by adult males and females and 13-year-old children, as a
function of listeners’ gender and age (7-8 years, 11-12 years and adult). Listener gender was not
a significant factor. Although women were slightly more intelligible than men, the authors argued
that the specific acoustic-phonetic characteristics of the individual talker were the more likely
determinant of the outcome. The gender interrelationship of the talker and listener was not
statistically significant. In contrast, Ericson and McKinley (1997) reported that in quiet, female
talkers  tended  to  mask  each  other  more  than  male  talkers  and  mixed  gender  pairs. 

An  unexpected  finding  was  that  listeners  performed  significantly  better  in  the  Bison  noise  than  in 
quiet  by  7%.  Our  previous  studies  have  shown  that  a  noise  background  can  either  have  no  effect 
or  be  detrimental  to  speech  understanding,  depending  on  the  SNR  and  the  spectrum  of  the  speech 
relative  to  the  noise  (Abel  et  al.,  1990;  Abel  et  al.,  2012).  An  analysis  of  order  effects  revealed 
that  the  outcome  was  possibly  due  to  the  lower  average  percentage  of  hits  for  the  quiet  conditions 
when  they  were  presented  before  the  noise  conditions.  When  the  noise  conditions  were  first,  the 
average  percentage  of  hits  for  the  quiet  conditions  was  higher,  although  the  difference  was  not 
statistically  significant.  The  pattern  of  outcomes  suggests  that  the  apparent  beneficial  effect  of  the 
noise  may  have  been  the  result  of  practice  rather  than  enhanced  intelligibility. 

It  was  also  found  that  the  percentage  of  hits  was  significantly  higher  for  the  male  right  target 
talker  than  for  the  male  left  and  female  target  talkers  whose  percentages  of  hits  were  not  different 
from  each  other.  This  could  be  the  result  of  differences  in  speech  clarity  or  accent.  Care  was  taken 
to  select  recordings  from  among  the  four  available  male  talkers  that  sounded  similar.  Spectral 
analysis  of  the  speech  waveforms  from  the  three  talkers  at  the  left  and  right  ears  confirmed  that 
they  were  similar  from  250  Hz  to  8  kHz  (see  Figure  5).  Small  differences  in  measurement  in  the 
order  of  5  dB  were  likely  due  to  right-left  differences  in  the  placement  of  the  earphones  of  the 
headset on the manikin head (Paquier et al., 2012). At the speech frequencies, 500 Hz to 4 kHz,
the  level  of  the  Bison  noise  was  either  at  or  below  the  level  of  the  speech.  The  possibility  that  ear 
dominance  accounted  for  the  higher  percentage  of  hits  for  the  male  right  talker  is  discussed 
below. 


Figure  5:  A  comparison  of  the  energy  spectra  for  the  target  messages 
and  the  Bison  noise,  separately  for  left  and  right  ears. 

One  of  the  questions  of  interest  was  whether  the  ear  assignment  of  the  female  talker,  left  ear  or 
right  ear,  would  affect  listeners’  ability  to  correctly  respond  to  target  messages  delivered  by  the 
three talkers. It was expected that there would be a relative advantage for the male talker who was
not accompanied by the female talker. The results showed that, indeed, when the female was
assigned to the left ear, the percentage of hits was significantly greater, by 25%, for the male right
talker than for either the male left talker or the female talker. However, when the female talker was
assigned  to  the  right  ear,  the  percentages  of  hits  were  similar  for  the  three  talkers,  ranging  from 
77-81%.  This  pattern  of  results  was  the  same  for  each  of  the  components  of  the  message  i.e.,  the 
ear,  gender,  colour  and  number,  taken  separately.  The  attribution  of  right-ear  superiority  to 
explain  the  outcome  is  supported  by  the  finding  that  the  percentage  of  hits  was  also  relatively 
greater  for  the  female  target  talker  when  she  was  assigned  to  the  right  rather  than  the  left  ear. 
These  findings  point  to  a  right  ear  advantage  in  processing  speech  that  corroborates  previous 
research  on  right  ear  dominance  in  right-handers  (Foundas  et  al.,  2006;  Kimura,  2011).  It  also 
supports  the  conclusion  that  the  dominant  right  ear  is  better  than  the  left  at  discriminating  among 
messages.  The  outcome  suggests  that  higher  priority  messages  should  be  delivered  to  the  right  ear 
of  right-handed  operators. 
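
The size of the right ear advantage can be checked against the means reported in the Results. The sketch below assumes that each advantage is the score of interest minus the average of the comparison scores, expressed in percentage points. The helper name is illustrative.

```python
def advantage(score, *comparison_scores):
    """Score minus the mean of the comparison scores (percentage points)."""
    return score - sum(comparison_scores) / len(comparison_scores)


# Mean percentages of hits quoted in the Results section.
mr_overall = advantage(85.8, 73.8, 73.0)      # MR vs ML and F, all conditions
mr_female_left = advantage(93.1, 66.8, 69.0)  # MR vs ML and F, female talker in left ear
f_right_vs_left = advantage(76.9, 69.0)       # F in right ear vs F in left ear

print(f"MR overall: {mr_overall:.1f}")
print(f"MR with female talker in left ear: {mr_female_left:.1f}")
print(f"F right vs left ear: {f_right_vs_left:.1f}")
```

These differences come to about 12, 25 and 8 percentage points, respectively, in line with the figures cited in this report for the male right talker and for the female talker assigned to the right ear.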

Missed  targets  and  false  alarms  proved  to  be  relatively  rare.  Misses  were  greater  than  12%  only  in 
the  case  of  target  messages  delivered  by  the  male  left  talker,  in  quiet  but  not  in  the  Bison  noise, 
when  the  female  talker  was  also  assigned  to  the  left  ear.  In  the  reverse  situation,  the  female  talker 
in  the  right  ear,  the  prevalence  of  misses  was  relatively  small  at  less  than  4%  for  the  male  right 
talker.  This  outcome  supports  the  conclusion  stated  above  that,  in  situations  of  auditory  overload, 
the  right  ear  would  be  better  at  handling  overlapping  messages  from  different  networks. 

Supplementary text (auditory-visual messaging) proved to be beneficial. Listeners achieved
relatively,  although  not  significantly,  higher  percentages  of  hits  when  the  target  talkers  were 
accompanied  by  text  than  when  they  were  not.  The  gain  amounted  to  26%  for  the  male  left  target 
talker,  10%  for  the  male  right  target  talker  and  15%  for  the  female  target  talker.  Within  each  of 
the  associated  target  talker  subgroups,  the  provision  of  supplementary  text  resulted  in  relatively 
better  performance  than  no  text  or  random  text  by  9-15%.  Although  text  improved  outcome,  the 
percent  correct  associated  with  the  talkers  who  were  not  accompanied  by  text  never  dipped  below 
62.5%,  showing  that  text  for  one  talker  did  not  result  in  inattention  to  target  messages  delivered 
by  the  unaccompanied  talkers.  The  finding  of  auditory-visual  benefit  corroborates  earlier  studies 
by  Grant  and  co-workers  (Grant  et  al.,  1998)  on  the  benefits  of  combining  auditory  and  visual 
inputs.  It  also  supports  recent  findings  of  the  utility  of  multi-modal  communications  in  command 
and  control  environments  (Finomore  et  al.,  2011).  The  outcomes  of  the  present  study  further 
suggest  the  value  of  adding  text  consistently  for  a  selected  talker,  given  simultaneous  messages 
from  different  networks  in  situations  characterized  by  auditory  overload. 

6  Conclusions 


In answer to the questions posed at the outset, this study showed that in situations characterized
by the simultaneous delivery of audio messages over multiple communication channels,
i.e., auditory overload during military operations:

1.  Male  and  female  listeners  performed  similarly,  as  did  listeners  engaged  in  military  and 
civilian  occupations. 

2.  Significantly  higher  scores  found  in  the  Bison  noise  for  a  speech-to-noise  ratio  of  5  dB  were 
possibly  due  to  practice. 

3.  Averaged  across  backgrounds,  ear  assignment  of  the  female  talker  and  text  conditions,  there 
was  an  advantage  of  12%  for  the  male  talker  assigned  to  the  right  ear,  and  a  relative 
advantage  of  8%  for  the  female  target  talker  when  she  was  assigned  to  the  right  ear.  The  right 
ear  was  also  better  than  the  left  at  distinguishing  between  talkers. 

4.  Male  and  female  talkers  were  equally  intelligible. 

5.  Listeners’  ability  to  understand  target  messages  was  relatively  greater  when  text  accompanied 
audio  presentations,  but  only  when  associated  with  a  selected  talker’s  target  messages  rather 
than  randomly  across  talkers. 

6.  Improvements  due  to  the  provision  of  supplementary  text  for  target  messages  from  one  talker 
did  not  result  in  inattention  to  target  messages  that  were  not  accompanied  by  text  from  either 
the  same  or  other  talkers. 